perm filename CH5[206,JMC] blob
sn#077895 filedate 1973-12-16 generic text, type C, neo UTF8
COMMENT ā VALID 00002 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 COMPILING IN LISP
C00008 ENDMK
Cā;
COMPILING IN LISP
Compiling is an important example of symbolic computation that
has received much study. A large part of this study has been devoted
to parsing which is essentially the transformation of an input string
in the source language into an internal form. The internal form used
depends on the compiler. Sometimes it is Polish prefix or postfix notation,
sometimes it is list structure, and sometimes it consists of entries in
a collection of tables.
When S-expression LISP is being compiled, the parsing is trivial,
because the input is S-expressions and the internal form wanted is
list structure, and therefore what parsing there is is done by the
ordinary LISP \F1read\F0 routine. Therefore, compilers can be very
compact and transparent in structure, and we can concentrate our attention
on the code generation phase. This is as it should be, because, in my
opinion, parsing is basically a side issue in compiling, and code
generation is the matter of main scientific interest.
We shall describe two compilers in this chapter called LCOM0 and
LCOM4 which compile S-expression LISP into machine language for the
PDP-10 computer according to the conventions of Stanford LISP 1.6.
For the present, we shall take these conventions for granted, although
in a later section, we shall discuss how they might be improved. Before
we describe the compilers, we must describe these conventions.
First of all, the target language is called LAP for LISP assembly
program. Each function is compiled separately into a separate LAP
program, and these programs are written onto a disk file. These files
can later be read into a LISP core image and assembled in place by
the LAP assembler which is part of the LISP runtime routines. The compiler,
on the other hand, is a separate LISP core image that can be instructed
to compile several input files and will produce output files of the
same name and file extension LAP. All this is specific to the PDP-10
time-sharing system and fortunately works the same whether the time-sharing
system is the D.E.C. system, the Stanford system, or TENEX.
Readers who do not use these time-sharing systems should relax; no knowledge
of them will be required.
The LAP program produced is a list of words;
the last word is NIL, and the first
word is a header of the form (LAP \F1fname\F0 SUBR) where \F1fname\F0
is the name of the function compiled. This header tells the DSKIN
program that it should read S-expressions till it comes to NIL and then
submit what it has collected to the LAP assembler which will assemble it
into core. The rest of the words are either atoms representing labels
or lists representing single PDP-10 instructions.
Before proceeding further, we need to say something about machine
language programs in general and about the PDP-10 in particular for the
benefit of those who are not familiar with these topics. However, it should
be understood that this will concern machine language programs in general and
has nothing special to do with computing with symbolic expressions in general
or LISP in particular.
DIGRESSION ON MACHINE LANGUAGE PROGRAMS
The hardware of a computer includes the processor, the memory, and
the input-output equipment. In this discussion we will ignore the input-output
equipment since the compiled LISP code will operate entirely within the
memory of the computer. The memory of a computer is divided into what we
may call hardware words each of which contains a fixed number of bits.
This is probably necessary for the hardware designer, and, in any case,
is true of all the computers with which I am familiar. The hardware word
is the unit of information transferred between memory and the processor of
the computer.